Enriching very large ontologies using the WWW

Authors

  • Eneko Agirre
  • Olatz Ansa
  • Eduard H. Hovy
  • David Martínez
Abstract

This paper explores the possibility of exploiting text on the World Wide Web to enrich the concepts in existing ontologies. First, a method for retrieving documents from the WWW related to a concept is described. These document collections are used 1) to construct topic signatures (lists of topically related words) for each concept in WordNet, and 2) to build hierarchical clusters of the concepts (the word senses) that lexicalize a given word. The overall goal is to overcome two shortcomings of WordNet: the lack of topical links among concepts, and the proliferation of senses. Topic signatures are validated on a word sense disambiguation task with good results, which improve further when the hierarchical clusters are used.
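To make the idea of a topic signature concrete, the following is a minimal sketch, not the authors' implementation. The abstract does not state how words are weighted, so the smoothed log-ratio score used here is an assumption, as are the function names `topic_signature` and `disambiguate` and the toy documents, which stand in for the document collections retrieved from the WWW for each sense.

```python
from collections import Counter
import math


def tokenize(text):
    # Naive whitespace tokenizer; a real system would also strip punctuation.
    return text.lower().split()


def topic_signature(docs_by_sense, target, top_n=5):
    """Return the top_n (word, weight) pairs for one sense.

    Assumed weighting (not from the paper): p_target * log(p_target / p_other),
    where p_other is the add-one-smoothed relative frequency of the word in
    the documents of all other senses.
    """
    target_counts = Counter(w for doc in docs_by_sense[target] for w in tokenize(doc))
    other_counts = Counter(
        w
        for sense, docs in docs_by_sense.items()
        if sense != target
        for doc in docs
        for w in tokenize(doc)
    )
    target_total = sum(target_counts.values())
    other_total = sum(other_counts.values())
    scores = {}
    for word, count in target_counts.items():
        p_target = count / target_total
        p_other = (other_counts[word] + 1) / (other_total + 1)
        scores[word] = p_target * math.log(p_target / p_other)
    # Sort by descending weight, breaking ties alphabetically for determinism.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))[:top_n]


def disambiguate(context, signatures):
    """Pick the sense whose signature words overlap the context most heavily."""
    words = set(tokenize(context))
    return max(
        signatures,
        key=lambda sense: sum(w for word, w in signatures[sense] if word in words),
    )


# Hypothetical toy collections for two senses of "bank".
docs = {
    "bank_finance": ["the bank approved the loan and the deposit",
                     "money deposit at the bank"],
    "bank_river": ["the river bank was muddy",
                   "fishing on the river bank water"],
}
signatures = {sense: topic_signature(docs, sense) for sense in docs}
```

With these toy collections, `disambiguate("the muddy river", signatures)` selects the river sense because both "river" and "muddy" appear in that sense's signature. Scoring each sense against the other senses of the same word, rather than against a general corpus, is a design choice made here only to keep the sketch self-contained.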

Similar resources

Enriching WordNet concepts with topic signatures

This paper explores the possibility of enriching the content of existing ontologies. The overall goal is to overcome the lack of topical links among concepts in WordNet. Each concept is to be associated with a topic signature, i.e., a set of related words with associated weights. The signatures can be automatically constructed from the WWW or from sense-tagged corpora. Both approaches are compare...

Enriching Ontology Concepts Based on Texts from WWW and Corpus

Despite the growing number of ontological engineering tools, ontology knowledge acquisition remains a highly manual, time-consuming and complex task. Automatic ontology learning is a well-established research field whose goal is to support the semi-automatic construction of ontologies starting from available digital resources (e.g., a corpus, web pages, dictionaries, semi-structured and structured...

Enriching Ontologies by Learned Negation - Or How to Teach Ontologies Vegetarianism

Ontologies form the basis of the semantic web by providing knowledge on concepts, relations and instances. Unfortunately, the manual creation of ontologies is a time-intensive and hence expensive task. This leads to the so-called knowledge acquisition bottleneck, which is a major obstacle to more widespread adoption of the semantic web. Ontology learning tries to widen the bottleneck by supporting...

Towards a Framework for Approximate Ontologies

Currently, there is a great deal of interest in developing tools for the generation and use of ontologies on the WWW. These knowledge structures are considered essential to the success of the semantic web, the next phase in the evolution of the WWW. Much recent work with ontologies assumes that the concepts used as building blocks are crisp as opposed to approximate. It is a premise of this pap...

Enriching Top-down Geo-ontologies Using Bottom-up Knowledge Mined from Linked Data

Geo-ontologies provide formal specifications of geographic concepts, and can be embedded into geographic information systems to support automatic reasoning. Traditionally, geo-ontologies are developed through a top-down approach in which a group of experts collaboratively decide about the formalization. While such an approach captures valuable expert knowledge, the resulting geo-ontologies coul...

Journal:
  • CoRR

Volume: cs.CL/0010026

Publication date: 2000